Washington State BDC Challenge Analysis¶
This will be a quick analysis of data from the BDC for the State of Washington based on data released from the FCC.
All of this data was downloaded from here and will be included in the repository associated with this analysis. We will work in reverse cronological order.
# Housekeeping stuff.
!python -m pip install --upgrade pip
!pip install --upgrade -r requirements.txt > output.txt
import zipfile, pathlib, itables
import pandas as pd
import numpy as np
import geopandas as gpd
import contextily as cx
import matplotlib.pyplot as plt
from pprint import pprint
Requirement already satisfied: pip in c:\users\nick.pappin\projects\wafabricchallengereview-20230808\venv\lib\site-packages (23.2.1)
Load Data¶
dfColumnHints = {
"location_id": "Int64",
"zip_code": "Int64"
}
jul23zip = zipfile.ZipFile("Washington - Fabric Challenge - Resolved - jul23.zip")
jul23 = pd.read_csv(jul23zip.open(jul23zip.filelist[0].filename), dtype=dfColumnHints)
jul23zip.close()
mar23zip = zipfile.ZipFile("Washington - Fabric Challenge - Resolved - mar23.zip")
mar23 = pd.read_csv(mar23zip.open(mar23zip.filelist[0].filename), dtype=dfColumnHints)
mar23zip.close()
dec22zip = zipfile.ZipFile("Washington - Fabric Challenge - Resolved - dec22.zip")
dec22 = pd.read_csv(dec22zip.open(dec22zip.filelist[0].filename), dtype=dfColumnHints)
dec22zip.close()
nov22zip = zipfile.ZipFile("Washington - Fabric Challenge - Resolved - nov22.zip")
nov22 = pd.read_csv(nov22zip.open(nov22zip.filelist[0].filename), dtype=dfColumnHints)
nov22zip.close()
C:\Users\nick.pappin\AppData\Local\Temp\ipykernel_29144\1898070276.py:6: DtypeWarning: Columns (10,13) have mixed types. Specify dtype option on import or set low_memory=False. jul23 = pd.read_csv(jul23zip.open(jul23zip.filelist[0].filename), dtype=dfColumnHints)
July - 2023¶
Let's start looking at the number of challenges that were successful and how many were overturned. The full data table will be at the end and the original CSV is in the corresponding zip file.
jul23.adjudication_status.value_counts(dropna=False)
adjudication_status Overturned 159956 Upheld 21867 Name: count, dtype: int64
And let's look at it with the category code.
jul23.pivot_table('challenge_id', index='category_code_desc', columns='adjudication_status', aggfunc="count")
| adjudication_status | Overturned | Upheld |
|---|---|---|
| category_code_desc | ||
| Add additional address for the location | 1 | 7 |
| Incorrect Location Address | 28 | 403 |
| Incorrect Location Building Type | 4432 | 13604 |
| Incorrect Location Unit Count | 2 | 18 |
| Location is Not Broadband Serviceable | 1956 | 617 |
| Location is Not Within Correct Building Footprint | 89 | 79 |
| Missing Broadband Serviceable Location | 153448 | 7139 |
Maps¶
First we are going to pull out the locations with no geographic information. This removed about 20K locations.
jul23geo = gpd.GeoDataFrame(jul23[jul23.latitude.notna() & jul23.longitude.notna()])
jul23geo = jul23geo.set_geometry(gpd.points_from_xy(jul23geo.longitude, jul23geo.latitude), crs='EPSG:4326')
jul23geo = jul23geo.to_crs(3857)
droppedChallenges = jul23.shape[0] - jul23geo.shape[0]
print(f'Before dropping challenges without coordinates: {jul23.shape[0]}')
print(f'After dropping challenges without coordinates: {jul23geo.shape[0]}')
print(f'Challenges dropped: {droppedChallenges}')
Before dropping challenges without coordinates: 181823 After dropping challenges without coordinates: 160755 Challenges dropped: 21068
With the no data challenges removed we get this map.
ax = jul23geo.plot(marker='.', markersize = 1, figsize=(15,9), alpha=.5)
cx.add_basemap(ax, crs=jul23geo.crs.to_string(), source=cx.providers.Stamen.TonerLite)
ax = ax.axis('off')
Let's look at the rejected challenges first.
rejected = jul23geo[jul23geo.adjudication_status == 'Overturned']
ax = rejected.plot(marker='.', markersize = 1, figsize=(15,9), alpha=.5)
cx.add_basemap(ax, crs=rejected.crs.to_string(), source=cx.providers.Stamen.TonerLite)
ax = ax.axis('off')
Not really much to see here aside from we challenged all over the state.
itables.show(rejected)
| challenge_id | fabric_vintage | category_code | category_code_desc | location_id | location_state | address_primary | city | state | zip_code | zip_code_suffix | unit_count | building_type_code | non_bsl_code | bsl_lacks_address_flag | address_id | latitude | longitude | date_received | withdraw_date | adjudication_status | adjudication_date | adjudication_code | adjudication_code_desc | fabric_response_location_id | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Loading... (need help?) |
itables.show(rejected.adjudication_code_desc.value_counts())
| count | |
|---|---|
| adjudication_code_desc | |
| Loading... (need help?) |
Unfortunatly with this dataset there is no real way to see or intuit failed challenges by FRN/Organization.
Here are the accepted challenges.
upheld = jul23geo[jul23geo.adjudication_status == 'Upheld']
ax = upheld.plot(marker='.', markersize = 1, figsize=(15,9), alpha=.5)
cx.add_basemap(ax, crs=upheld.crs.to_string(), source=cx.providers.Stamen.TonerLite)
ax = ax.axis('off')
Interesting to see where there were really successful challenge attempts this go around. And here is an interactive version.
upheld.explore()